Practical Relevance Ranking for 10 Million Books

Author

  • Tom Burton-West
Abstract

In this paper, we briefly describe our production environment and some of the open questions about relevance ranking for 10 million books. We then describe our participation in the Prove It task of the INEX Social Book Search Track. We found that the queries supplied with the Prove It topics were not specific enough to provide good retrieval results. In contrast, the fact fields of the topics, when used as queries, provided good retrieval results. However, our query logs show that users are unlikely to enter queries as long as the fact fields. We therefore tried to create queries that still provided good retrieval results but better represented the queries in our logs. We also experimented with simulating the two-stage search process used in our production system, which first searches the entire corpus of 10 million books to find relevant books and then searches within a book to find relevant pages. While we succeeded in creating queries that were more specific than those supplied in the Prove It topics, and those queries produced better results, questions remain about how representative these created queries are of real user queries.


Similar resources

Bibliometrics in Online Book Discussions: Lessons for Complex Search Tasks

Online book discussion forums provide rich information on how readers think about and describe books, how books are related to other books and how people search for and recommend books. Within the Social Book Search (SBS) Lab at CLEF we analyse book search requests on the LibraryThing forums and find several types of complex search tasks where bibliometrics naturally combines with information r...

Focused Search in Books and Wikipedia: Categories, Links and Relevance Feedback

In this paper we describe our participation in INEX 2009 in the Ad Hoc Track, the Book Track, and the Entity Ranking Track. In the Ad Hoc track we investigate focused link evidence, using only links from retrieved sections. The new collection is not only annotated with Wikipedia categories, but also with YAGO/WordNet categories. We explore how we can use both types of category information, in t...

Spatially-Aware Information Retrieval on the Internet

In this report, we describe a practical relevance ranking procedure, as it is implemented and integrated in the interim prototype of the SPIRIT search engine. We review the theoretical models and ideas presented in the previous three deliverables of WP5, and state the practical decisions and refinements made during implementation. Possible improvements are identified which will lead to an advan...

Répondre à des requêtes cliniques PICO

In this paper, we address the issue of answering PICO (Patient/Problem, Intervention, Comparison, Outcome) clinical queries. The contributions of this work include (1) a new document ranking model based on a prioritized aggregation operator that computes the global relevance score based on the relevance estimation of the semantic facet sub-queries and (2) leverages the importance of the facets ...

A Comparison of the Position of Laboratory Activities in the Biology Textbooks of Iran and England

The aim of this study was to compare the position of laboratory work in biology textbooks in Iran and the United Kingdom. The tenth-grade biology textbooks of both countries were selected as the sample and analyzed against assessment criteria for laboratory work; the absolute frequency, relative frequency, and percentage relative frequency were then determined for each...



Publication date: 2012